N=1:

rank | frequency | n-gram |
---|---|---|
1 | 44124 | -e |
2 | 39540 | -s |
3 | 21178 | -t |
4 | 16289 | -n |
5 | 15255 | -r |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 15612 | -es |
2 | 11475 | -nt |
3 | 9934 | -er |
4 | 7342 | -on |
5 | 5461 | -re |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 7034 | -ent |
2 | 4214 | -ion |
3 | 3241 | -ant |
4 | 2997 | -ons |
5 | 2295 | -ait |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 3397 | -tion |
2 | 2664 | -ment |
3 | 1568 | -ique |
4 | 1543 | -ions |
5 | 1428 | -ient |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 2377 | -ation |
2 | 2236 | -ement |
3 | 1187 | -aient |
4 | 1072 | -tions |
5 | 737 | -iques |

The tables show the five most frequent letter n-grams at the end of words, one table for each N = 1…5 in ascending order. The procedure parallels 2.2.5 Most frequent word beginnings, except that the aim here is suffix detection instead of prefix detection.
For N=3:
SELECT @pos := @pos + 1 AS pos, xx.*
FROM (SELECT @pos := 0) r,
  (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word, 3)) AS ending
   FROM words
   WHERE w_id > 100
   GROUP BY RIGHT(word, 3)
   ORDER BY cnt DESC) xx
LIMIT 5;
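
For readers without access to the database, the same counting can be sketched in Python. This is only an illustration of what the query computes, not the code behind the tables: the function name `top_endings`, its parameters, and the sample word list are all hypothetical, and the sketch assumes the input holds one entry per row of the `words` table (after the `w_id>100` filter).

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams.

    Each result row is (rank, count, ending), mirroring the columns
    rank | frequency | n-gram. `words` is assumed to be an iterable of
    strings, one per row of the `words` table (hypothetical input).
    """
    # word[-n:] behaves like SQL RIGHT(word, n): a word shorter than n
    # contributes the whole word. Empty strings are skipped.
    counts = Counter(word[-n:] for word in words if word)
    # most_common(k) sorts by descending count, like ORDER BY cnt DESC LIMIT k.
    return [
        (rank, count, "-" + ending)
        for rank, (ending, count) in enumerate(counts.most_common(k), start=1)
    ]


if __name__ == "__main__":
    # Tiny made-up sample, only to show the call; not data from the corpus.
    sample = ["nation", "station", "action", "moment", "parent", "chant"]
    for n in range(1, 6):
        print(n, top_endings(sample, n, k=3))
```

Looping over `n` from 1 to 5, as in the usage lines, corresponds to running the query once per value of N with `right(word,3)` changed accordingly.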
2.2.5 Most frequent word beginnings